Inside-Outside Estimation of a Lexicalized PCFG for German
Abstract
The paper describes an extensive experiment in inside-outside estimation of a lexicalized probabilistic context-free grammar for German verb-final clauses. Grammar and formalism features which make the experiment feasible are described. Successive models are evaluated on precision and recall of phrase markup.
Similar resources
Disambiguation of Morphological Structure using a PCFG
German has a productive morphology and allows the creation of complex words which are often highly ambiguous. This paper reports on the development of a head-lexicalized PCFG for the disambiguation of German morphological analyses. The grammar is trained on unlabeled data using the Inside-Outside algorithm. The parser achieves a precision of more than 68% on difficult test data, which is 23% mo...
Valence Induction with a Head-Lexicalized PCFG
Either directly or indirectly, the lexicon for a natural language specifies complementation frames or valences for open-class words such as verbs and nouns. Constructing a lexicon of complementation frames for large vocabularies constitutes a challenge of scale, with the further complication that frame usage, like vocabulary, varies with genre and undergoes ongoing innovation in a living langua...
Scalable Discriminative Parsing for German
Generative lexicalized parsing models, which are the mainstay for probabilistic parsing of English, do not perform as well when applied to languages with different language-specific properties such as free(r) word order or rich morphology. For German and other non-English languages, linguistically motivated complex treebank transformations have been shown to improve performance within the frame...
Spatial Random Trees and the Center-Surround Algorithm
A new class of multiscale stochastic processes called spatial random trees (SRTs) is introduced and studied. As with previous multiscale stochastic processes, SRTs model multidimensional signals using random processes on trees. Our key innovation, however, is that the tree structure itself is random and is generated by a probabilistic context-free grammar (PCFG) [26]. While PCFGs have been used...
Lexicalization in Crosslinguistic Probabilistic Parsing: The Case of French
This paper presents the first probabilistic parsing results for French, using the recently released French Treebank. We start with an unlexicalized PCFG as a baseline model, which is enriched to the level of Collins’ Model 2 by adding lexicalization and subcategorization. The lexicalized sister-head model and a bigram model are also tested, to deal with the flatness of the French Treebank. The ...
Journal title:
- CoRR
Volume: cs.CL/9905009
Publication date: 1999